Multi-Modal Automation for Human Interactive Skill Assessment

ABSTRACT

A method for screening applicants by a screening entity includes the steps of remotely accessing a screening entity's multi-modal pre-screening tool by a candidate using an Internet-accessible device, the pre-screening tool comprising audio and visual elements operable to communicatively interact with the candidate, and carrying out an interactive skills assessment of the candidate by communicating at least one cue to the candidate from the screening entity, requiring the candidate, upon receiving the at least one cue, to simultaneously interact with the pre-screening tool and, based upon this interaction, communicate a verbal response to the cue, recording the candidate's verbal response with the pre-screening tool and storing the verbal response in a database profile associated with the candidate and accessible by the screening entity, and analyzing the candidate's recorded response by the screening entity and carrying out a quality and criteria judgment to determine the hiring potential of the candidate.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a divisional application of U.S. patent application Ser. No. 12/116,433 filed May 7, 2008, which application claims the priority, under 35 U.S.C. §119, of U.S. Provisional Patent Application No. 60/928,895 filed May 11, 2007, the entire disclosures of which are hereby incorporated herein by reference in their entireties.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

n/a

FIELD OF THE INVENTION

The present invention lies in the field of off-site customer support, in particular, in the field of identifying qualified human agents for providing enhanced customer support. The method can be used, particularly, as a tool to assist in separating hirable agents from unhirable agents.

BACKGROUND OF THE INVENTION

Customer-support centers rely on trained human agents who possess skills that are suited for the type of service that is being delivered. There are several skill requirements that are common across call centers. One example of such desirable skills is good speech intelligibility, for a specific language, when speaking over a telephone to a customer. Another desirable skill is the ability to interact with web tools that utilize screen monitors, keyboards, and other control devices. For customers receiving assistance over the telephone, an ideal experience can be delivered by a human agent that speaks the language well, performs the proper tasks, delivers the right information, and sounds delightful while serving the customer in a timely manner.

In the context of identifying qualified human agents, it is desirable to find efficient methods for measuring one's ability to combine speaking skills and web interaction to produce good customer experience during telephone support scenarios. It is also desirable to find efficient methods for assessing a person's creativity and ability to express that creativity verbally. It is particularly desirable to find efficient ways to predict that one will deliver delightful experiences to customers seeking assistance. It is further desirable to determine if an agent has aptitude in certain areas of specialization.

BRIEF SUMMARY OF THE INVENTION

The invention provides multi-modal automation for human interactive skill assessment that overcomes the herein-mentioned disadvantages of the heretofore-known devices and methods of this general type and that, in the context of identifying qualified human agents, measures one's ability to combine speaking skills and web interaction to produce a positive customer experience during telephone support scenarios, assesses a person's creativity and ability to express that creativity verbally, predicts who is able to deliver delightful experiences to customers seeking assistance, and determines if that agent has aptitude in certain areas of specialization.

The present invention does so by helping automate the identification of human agents possessing these qualities.

The present inventive process can automatically pre-screen call center applicants based on pre-defined speech tasks. The application is multimodal and requires simultaneous telephone and Internet web page access by the applicants. There are critical benefits that become realized when telephone interaction is coupled with visual information provided through web access. In particular, applicants can read web-based scripted information into the telephone handset. The telephone speech can be recorded and analyzed in a variety of ways, including subjective human assessment and automated assessment provided by a speech recognizer. Figures, pictures, or any other form of graphics can also provide the basis for a speech task. For example, a map with a highlighted route could be displayed on a web page and the speech task for the interviewee could include speaking driving directions over the telephone. Individual driving instructions would need to be accurate (“head east on” instead of “head west on”) and the street names would need to be pronounced correctly.

The present invention is an automated screening application that identifies speech clarity, basic thought process, and experience. The invention accomplishes tasks normally performed by call center recruiters or supervisors of call center personnel and, thereby, reduces the work load and/or personnel for interviewing potential candidates; such employees are, typically, highly compensated (at least when compared to the potential candidates).

The present invention has many valuable characteristics not present in any previous automated method for interviewing qualified candidates, including, for example:

1. Highly customizable

2. Highly automated

3. Highly efficient

4. Universal access, any language

5. Content flexibility

6. Adjustable acceptance criteria

7. Remote access, any location

The inventive process is highly customizable and can be optimized for almost any customer service environment. Examples of customer service environments include computer technical support, concierges, airline reservations, utilities, telemarketing, car rentals, vacation planning, roadside assistance, and home security. The type of dialogue that is recorded can be directly correlated to the line of business. In a later discussion, examples will serve to illustrate the various types of dialogue that are recorded.

The application is highly automated and consumes a minimum amount of applicant time and analysis time by human resource personnel. Using multi-modal automation, speech tasks, closely tied to audio and visual cues, are recorded and analyzed to evaluate candidates. Although the examples provided in this document are in English, the application can be made available in any language and in various combinations of languages where such skills are also being evaluated.

Because the inventive process is automated and available electronically, e.g., over the Internet, the program is accessible by applicants at any time, and from any location, as long as Internet and telephone access exist. There is no scheduling required and the application can run stand alone without human supervision. From the human resources side, applicant data can be reviewed at any time after being stored and can be configured to only require appropriate World Wide Web access including the ability to listen to recorded audio, e.g., through an audio wave file player.

With the foregoing and other objects in view, there is provided, in accordance with the invention, a method for screening applicants by a screening entity, the method including the steps of remotely accessing a screening entity's multi-modal pre-screening tool by a candidate using an Internet-accessible device, the pre-screening tool comprising audio and visual elements operable to communicatively interact with the candidate and carrying out an interactive skills assessment of the candidate by communicating at least one cue to the candidate from the screening entity, requiring the candidate, upon receiving the at least one cue, to simultaneously interact with the pre-screening tool and, based upon this interaction, communicate a verbal response to the cue, recording the candidate's verbal response with the pre-screening tool and storing the verbal response in a database profile associated with the candidate and accessible by the screening entity, and analyzing the candidate's recorded response by the screening entity and carrying out a quality and criteria judgment to determine the hiring potential of the candidate.

With the objects of the invention in view, there is also provided a method for screening applicants by a screening entity, the method including the steps of remotely accessing a screening entity's multi-modal pre-screening tool by a candidate using an Internet-accessible device, the pre-screening tool comprising audio and visual elements operable to communicatively interact with the candidate and carrying out an interactive skills assessment of the candidate with the pre-screening tool by communicating an instruction to the candidate from the screening entity, requiring the candidate, upon receiving the instruction, to perform a pre-defined speech task by simultaneously viewing at least one of textual and pictorial content displayed to the candidate using the visual element of the pre-screening tool and communicating, with the audio element of the pre-screening tool, a verbal response from the candidate to the screening entity based upon the displayed content, recording the candidate's verbal response with the pre-screening tool and storing the verbal response in a database profile associated with the candidate and accessible by the screening entity, and analyzing the candidate's recorded response by the screening entity and carrying out a quality and criteria judgment to determine the hiring potential of the candidate.

In accordance with another mode of the invention, the communicating steps are carried out by communicating the at least one cue to the candidate using at least the audio element of the pre-screening tool and communicating the candidate's verbal response to the screening entity using the audio element of the pre-screening tool.

In accordance with a further mode of the invention, the candidate's interaction is carried out with the pre-screening tool using the visual element.

In accordance with an added mode of the invention, the candidate's interaction using the visual element of the pre-screening tool comprises viewing at least one of textual and pictorial content displayed to the candidate using the visual element.

In accordance with an additional mode of the invention, the analyzing step further comprises automatically determining a confidence score of the candidate's verbal response with a computerized speech recognition tool of the pre-screening tool.

In accordance with yet another mode of the invention, the visual element of the pre-screening tool comprises an Internet web page.

In accordance with yet a further mode of the invention, the audio element of the pre-screening tool comprises a Voice-Over-Internet Protocol component.

In accordance with yet an added mode of the invention, the candidate's identity is confirmed prior to commencing the interactive skills assessment of the candidate through a set of identification questions communicated to the candidate using the pre-screening tool.

In accordance with yet an additional mode of the invention, the storing step is carried out selectively or continually.

In accordance with again another mode of the invention, the computerized speech recognition tool is programmed to parse a particular response into individual words and to either compare the parsed response to a desired response or to transcribe the parsed response for later use and access by the screening entity.

In accordance with again a further mode of the invention, the automated confidence scoring step is carried out by judging at least an accuracy and an intelligibility of the candidate's speech utilizing a target phrase represented in a speech recognition grammar.

In accordance with again an added mode of the invention, the communicating step is carried out by communicating the instruction to the candidate using at least the audio element of the pre-screening tool.

In accordance with again an additional mode of the invention, the instruction communicating step comprises prompting the candidate to view a written script displayed by the visual element of the pre-screening tool and the speech task step comprises requiring the candidate to read the written script aloud.

In accordance with still another mode of the invention, the instruction communicating step comprises prompting the candidate to verbally repeat at least one spoken phrase.

In accordance with still a further mode of the invention, the instruction communicating step comprises prompting the candidate to verbally answer at least one multiple-choice question displayed to the candidate by the visual element of the pre-screening tool.

In accordance with a concomitant mode of the invention, the instruction communicating step comprises prompting the candidate to verbally answer at least one question regarding a graphic displayed to the candidate by the visual element of the pre-screening tool.

Other features that are considered as characteristic for the invention are set forth in the appended claims.

Although the invention is illustrated and described herein as embodied in multi-modal automation for human interactive skill assessment, it is, nevertheless, not intended to be limited to the details shown because various modifications and structural changes may be made therein without departing from the spirit of the invention and within the scope and range of equivalents of the claims.

The construction and method of operation of the invention, however, together with additional objects and advantages thereof, will be best understood from the following description of specific embodiments when read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Advantages of embodiments of the present invention will be apparent from the following detailed description of the preferred embodiments thereof, which description should be considered in conjunction with the accompanying drawings in which:

FIG. 1 is a block diagram of high-level components of an exemplary system architecture for carrying out the method according to the invention;

FIG. 2 is a process flow diagram of one exemplary process for carrying out the method according to the invention;

FIG. 3 is a diagrammatic representation of an exemplary web interface for receiving applicant information in the method according to the invention;

FIG. 4 is a diagrammatic representation of an exemplary web interface for carrying out verbal-applicant-screening exercises in the method according to the invention;

FIG. 5 is a diagrammatic representation of an exemplary web interface for carrying out graphic-applicant-screening exercises in the method according to the invention;

FIG. 6 is a list of an exemplary output queue of applicants to be reviewed by screening entities in the method according to the invention; and

FIG. 7 is a diagrammatic representation of an exemplary web interface for reviewing an applicant's screening results in the method according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Aspects of the invention are disclosed in the following description and related drawings directed to specific embodiments of the invention. Alternate embodiments may be devised without departing from the spirit or the scope of the invention. Additionally, well-known elements of exemplary embodiments of the invention will not be described in detail or will be omitted so as not to obscure the relevant details of the invention.

Before the present invention is disclosed and described, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.

While the specification concludes with claims defining the features of the invention that are regarded as novel, it is believed that the invention will be better understood from a consideration of the following description in conjunction with the drawing figures, in which like reference numerals are carried forward. The figures of the drawings are not drawn to scale.

Referring now to the figures of the drawings in detail and first, particularly to FIG. 1 thereof, there is shown an illustration of high-level components of an exemplary system architecture for carrying out the present invention. During an applicant screening process, a potential candidate 1, by operating a computer 3, accesses a link through the Internet 11, also known as the “World Wide Web,” to a server 4 that hosts an external web page. From the external web page hosted by the external server 4, an internal server 9 that hosts a web site internal to the applicant screening entity 10 is accessed. The pre-screening web site in the internal server 9 requests the applicant 1 to input responses to various queries tailored to the screening entity 10. The responses are stored for later use or, as a course of record keeping, in a database 8, for example.

Either simultaneously or thereafter, the candidate 1 is called on a telephone 2 through a private branch exchange (“PBX”) 6 over a telephone network 5. A voice server 13 initiates this call 14 to the applicant 1 automatically in response to the candidate's response. The candidate's identity can be confirmed (or not) through a set of identification questions and responses. After confirmation, an interactive process begins with the potential candidate 1. The process includes both the telephone 2 and the computer 3 having access to the Internet 11. The voice server 13 generates audio prompts to the candidate 1 and records the applicant's verbal responses. After the candidate 1 completes the exercises, a profile is stored in a database 8 for later access and analyzing by the Human Resource Department of the screening entity 10, for example, via access through an entity-secure intranet link 12. The analysis of the recorded responses assists the screening entity 10 to make quality judgments about the candidate 1.

As with other Human Resource issues, it is desirable to control access to this screening process. Internet security can be applied to the applicant's access to the external server 4, and an outbound calling strategy also can control access to the screening process. During the applicant screening process, control of the number and kind of questions presented on the applicant's computer 3 and of the phone calls made to the applicant's phone 2 screens the individual ability of the applicant 1 to follow instructions and to interact in a multi-modal environment, which simulates a real call-center seat. Speech is recorded selectively throughout the process (or continually) and is stored for subsequent evaluation. It is important, to every extent possible, to automate the evaluation using confidence scoring produced by a speech recognition device; in other words, with a sufficiently sophisticated speech-recognition process, the responses can be parsed into individual words and compared to desired responses or transcribed for later use and fast access by the entity's evaluation staff. As phrases are pronounced, the voice server 13, which is able to recognize speech, judges at least two characteristics of the applicant's speech: accuracy and intelligibility. This automated confidence scoring quantitatively measures pronunciation quality, for example, for a target phrase that is represented in a speech recognition grammar.
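By way of illustration only, the step of parsing a response into individual words and comparing it to a desired response can be sketched as follows. The sketch assumes that a transcript has already been produced by a speech recognizer; the function names `parse_words` and `word_match_ratio` are hypothetical, and the similarity ratio shown is a simple word-sequence comparison, not the recognizer-produced confidence score described above.

```python
from difflib import SequenceMatcher


def parse_words(response: str) -> list[str]:
    """Split a transcribed response into lowercase words, dropping punctuation."""
    cleaned = "".join(
        ch.lower() if ch.isalnum() or ch.isspace() else " " for ch in response
    )
    return cleaned.split()


def word_match_ratio(transcript: str, desired: str) -> float:
    """Return a similarity in [0, 1] between two word sequences.

    Assumption: this stands in for a comparison step only; it does not
    measure pronunciation quality or intelligibility.
    """
    matcher = SequenceMatcher(None, parse_words(transcript), parse_words(desired))
    return matcher.ratio()


if __name__ == "__main__":
    target = "Head east on Main Street"  # hypothetical target phrase
    print(word_match_ratio("head east on main street.", target))
    print(word_match_ratio("head west on main street", target))
```

In such a sketch, a response that substitutes "west" for "east" yields a ratio below 1, which a screening entity could use to flag the recording for closer review.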

An exemplary applicant evaluation process is explained below with reference to the process flow chart of FIG. 2 illustrating one exemplary embodiment of the present invention. The flow starts at step 200 and moves directly to step 202 where the applicant 1 connects to a secured web page and initiates the application procedure. As part of the initiation, applicant 1 can, for example, read an overview, accept terms of use, and select a <continue> option to advance to the next step in the procedure. In step 204, the applicant enters their personal identification data, for example, name, address, email address, and/or current telephone number. It is noted that this data can be confirmed directly or indirectly through a response-requiring email sent to the applicant's email address. An exemplary web-provided screen for receiving this information is illustrated in FIG. 3.

In step 206, the voice server 13 immediately initiates a phone call to applicant 1, while applicant 1 is still logged on to the web application. In step 208, a voice quality exercise is initiated. The applicant 1 is asked to repeat spoken phrases over his/her telephone 2. More specifically, specific phrases of a human or machine voice are transmitted over the phone line to applicant 1 and applicant 1 repeats these phrases/words shortly after each phrase is played. For each phrase, the applicant's speech is recorded and end-pointed for off-line analysis. Each recorded phrase is automatically scored with a confidence level that is correlated with how well the recorded phrase matches the expected pronunciation, as represented in a speech recognition grammar with highly tuned pronunciation lexicons. With such voice recognition grammar, strong accents and mispronunciations, for example, will map into low-confidence scores. Conversely, clearly spoken words with proper inflections and pronunciations will map into high-confidence scores.

In step 210, a dictation clarity exercise is carried out. To start this exercise, the applicant 1 can be instructed to continue by either selecting specific web link buttons (visual) or by pressing keys on the telephone keypad. To detect clarity of dictation, applicant 1 is required to read a script, which is sent to a particular web page viewable by the applicant 1. The applicant 1 can be given time to read and study the script before speaking the script into the telephone 2. To allow for this study time (which can be limited if desired by the screening entity), the applicant 1 will press a key on the telephone keypad (a web button can also be used with the appropriate architecture) and then dictate the provided script. Upon completion, the applicant 1 can be asked to press a telephone key or web button. Speech from the applicant is stored for subsequent off-line processing. Although confidence scoring can be applied, additional human judgment can be used because tone, volume, and other acoustical characteristics are more subjective than objective and, possibly, can be best analyzed by a trained employee. In this way, the employee can score any and all aspects of how well the script was spoken by the applicant 1.

In step 212, a question-answer exercise is initiated. Reference is made to “Exercise 3” in FIG. 4, which is a sample instruction to an applicant 1. Specific multiple-choice questions are transmitted over the telephone 2 by a human or machine voice and the applicant 1 is prompted to answer each question after it occurs. For each answer, the applicant's speech is recorded and is end-pointed for off-line analysis. Each recorded phrase is automatically scored with a confidence level that is correlated to how well the recorded phrase matches the expected pronunciation, as represented in a speech recognition grammar with highly tuned pronunciation lexicons. As set forth above, strong accents and mispronunciations map into low confidence scores. If an incorrect answer is spoken, a pre-defined low confidence score will most likely be assigned. Alternatively, if a correct answer is spoken, a pre-defined high confidence score will most likely be assigned.

In step 214, a service knowledge exercise is initiated. This speaking exercise is aimed at discovering whether the applicant 1 can understand what good customer service is and whether he/she can intelligently describe such an experience. More specifically, as shown in “Exercise 4” in FIG. 4, the applicant 1 is asked, for instance, to describe a delightful service that he/she has experienced. The applicant is allowed time to think of the experience and how he/she would like to describe the experience to the screening entity within a certain time limit (such as 2 minutes). It should be noted that any questions can be presented to the applicant 1 for the purpose of screening the applicant's ability to speak and respond and the present invention is not limited to only those questions related to customer-support experiences. The applicant 1, then, speaks over the telephone (or Internet) after pressing a key, for example, on the telephone keypad (a web button can also be used). Upon completion, the applicant 1 indicates that he/she is finished by pressing a telephone key or web button. The speech presented by the applicant can be stored for subsequent off-line processing. Although confidence scoring can be applied to the spoken words and sentences, here, human judgment can be given greater weight because the content of the speech will not be known ahead of time. In particular, human judgment can be used to evaluate the applicant's persona and how well the applicant's voice will sound to a customer. Additionally, human judgment can be used to score the applicant's grammar, intonation, and general talent in the area of servicing customers. Thus, human subjectivity is well suited for scoring the content and quality of what was spoken.

In step 216, a graphic comprehension exercise is initiated. This exercise screens the applicant's ability to respond to a provided graphic. For example, an image, representing the evaluation material, is displayed on a web page that is visible to the applicant 1. Instead of reading a script, repeating a phrase, answering a spoken question, or describing an experience, the applicant 1 is required to deduce answers to questions from the provided image and to speak their answers over the telephone upon being prompted to do so. For each answer, the applicant's speech is recorded and is end-pointed for off-line analysis. Each recorded phrase is automatically scored with a confidence level that is correlated with how well the recorded phrase matches the expected pronunciation, for instance, as represented in a speech recognition grammar with highly tuned pronunciation lexicons. As before, strong accents and mispronunciations map into low confidence scores. In this exercise, a word spoken incorrectly significantly reduces the confidence score, which may be even further reduced when other confidence-lowering factors are present, such as improper microphone placement, for example.

An exemplary graphic comprehension exercise is illustrated as “Exercise 5” in FIG. 5. This example screens an applicant's ability to give driving directions. The map graphic is displayed to the applicant 1. The applicant 1 is asked to give turn-by-turn driving directions and, if desired, a list of possible driving instructions. These instructions can be listed in random order (as shown) or they can be listed in order from start to destination (however, this latter approach removes the deductive reasoning and cartographic analyses that can be performed with this exercise). In the most difficult case, the applicant 1 will not be provided with instructions and will be asked to guide a virtual driver from the starting point to the destination.
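As a non-limiting sketch of the driving-directions exercise, the following shows how a list of candidate instructions could be presented in random order and how a spoken sequence (assumed to be already transcribed) could be checked against the route order. The route contents, the function names, and the exact-match comparison are all assumptions made for illustration.

```python
import random

# Hypothetical route, listed in order from start to destination.
ROUTE = [
    "head east on Main Street",
    "turn left on Oak Avenue",
    "turn right on Pine Road",
]


def shuffled_options(route: list[str], seed: int) -> list[str]:
    """Return the route instructions in a seeded random order for display."""
    rng = random.Random(seed)
    options = list(route)  # copy so the reference route is left untouched
    rng.shuffle(options)
    return options


def directions_in_order(spoken: list[str], route: list[str]) -> bool:
    """True when the transcribed instructions match the route, step by step."""
    normalized = [s.strip().lower() for s in spoken]
    return normalized == [r.lower() for r in route]


if __name__ == "__main__":
    print(shuffled_options(ROUTE, seed=7))
    print(directions_in_order(ROUTE, ROUTE))
```

An exact match is a deliberately strict choice for this sketch; a deployed system would more plausibly rely on the recognizer's grammar and confidence scoring.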

In step 218, a speaking satisfaction exercise is initiated. This speaking exercise is aimed at determining the kind of experience that a customer will have after speaking to the applicant 1 (such as delight, satisfaction, displeasure, or honor); the experience type can be referred to as a “pleasantry factor.” The applicant is given one of a series of random situations in which they will be required to role-play an operator answering a call from a driver (random and/or coordinated selection is desired where an applicant 1 can enter the application process more than once, so that it is ensured that a different scenario is role-played every subsequent time). It is desirable to not give the applicant 1 time to think and prepare because a “real-time” operator assistance experience is the desired output. The applicant 1 can be allowed to listen to a driver, for example, one who has just witnessed an accident, who has been involved in an accident, who has locked their child in a car, and many other scenarios, and then be asked to counsel and assist the driver. The conversation between the applicant 1 and the virtual driver is stored for subsequent off-line processing. Like step 216, only limited automatic confidence scoring can be applied to show use of grammar and pronunciation, for example. In this case, subjective human judgment is most important to score how well the potential operator dealt with the supplied situation.
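One possible way to coordinate scenario selection so that a returning applicant does not role-play the same scenario twice is sketched below. The function name `pick_scenario`, the scenario labels, and the fallback of reusing scenarios once all have been played are assumptions, not features stated in this description.

```python
import random
from typing import Optional


def pick_scenario(
    scenarios: list[str],
    already_used: set[str],
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a random scenario the applicant has not yet role-played."""
    rng = rng or random.Random()
    unused = [s for s in scenarios if s not in already_used]
    # Assumption: when every scenario has been used, fall back to the full list.
    pool = unused or list(scenarios)
    return rng.choice(pool)


if __name__ == "__main__":
    scenarios = ["witnessed accident", "involved in accident", "child locked in car"]
    print(pick_scenario(scenarios, already_used={"witnessed accident"}))
```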

Any number of other kinds of additional exercises can be performed as desired in step 220. Once all exercises are complete, the applicant 1 is informed that the application process is done and the data is ready for analysis. If desired, the applicant 1 can be given a timeframe for hearing from the screening entity or given a call number and a date for checking on his/her application. Not all of the above exercises are necessary or required. The exercises can occur in any order and in any combination, and some can be eliminated if desired.
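Because the exercises can be ordered, combined, or omitted, a screening entity could express its chosen sequence as a simple configuration. The following sketch maps the step numbers of FIG. 2 to the exercise names used above; the function `build_plan` and the representation as a dictionary are hypothetical.

```python
# Step numbers and exercise names follow the FIG. 2 description above.
EXERCISES = {
    208: "voice quality",
    210: "dictation clarity",
    212: "question-answer",
    214: "service knowledge",
    216: "graphic comprehension",
    218: "speaking satisfaction",
}


def build_plan(step_order: list[int]) -> list[str]:
    """Return exercise names in the requested order, rejecting unknown steps."""
    unknown = [step for step in step_order if step not in EXERCISES]
    if unknown:
        raise ValueError(f"unknown exercise steps: {unknown}")
    return [EXERCISES[step] for step in step_order]


if __name__ == "__main__":
    # A shortened screening that skips steps 210, 212, and 214.
    print(build_plan([216, 208, 218]))
```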

The process of the present invention now allows HR personnel 10 to review the applicant's stored data 8, in step 224, at any time, whether through a web access 9 or after it has been stored internally, in step 222, at the screening entity's selected data storage location. The process ends at step 226.

Many different screening entities can be allowed to access their own or any other entity's screening data. For example, where an applicant is determined to be less suitable for one kind of employment opportunity, that person's performance may be suitable for another opportunity and having the data available may be beneficial if different entities agree to share the screening exercises and recorded results. As such, an administrative web page 9 can be accessed from an intranet link 12 or from any resource connected to the Internet 11, provided that sufficient and/or desired security requirements are met. In order to review the recorded data, measures for playing recorded audio, such as audio wave files, to the human resources personnel 10 are needed.

After an applicant 1 completes the set of exercises, an audio profile (e.g., a web page audio profile) is automatically created, specific to that applicant 1. For automated and first-in-first-out processing of all applications received by the screening entity, applicant-specific identification data can be placed into a queue, as illustrated in FIG. 6, for example, for convenient access and processing by human resources personnel, such as over the World Wide Web. To access data regarding a specific applicant 1, a “score applicant” link can be selected. Phone numbers in FIG. 6 can be 4-digit internal extensions as well as 10-digit external phone numbers.
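The first-in-first-out queue of applicants awaiting review can be illustrated with a minimal sketch. The class name `ApplicantQueue`, its method names, and the record fields are assumptions chosen for illustration; the queue of FIG. 6 may carry additional data.

```python
from collections import deque
from typing import Optional


class ApplicantQueue:
    """Holds applicant identification data in arrival order (first in, first out)."""

    def __init__(self) -> None:
        self._pending: deque = deque()

    def submit(self, applicant_id: str, phone: str) -> None:
        """Add a completed application to the back of the queue."""
        self._pending.append({"applicant_id": applicant_id, "phone": phone})

    def next_to_score(self) -> Optional[dict]:
        """Remove and return the oldest pending applicant, or None if empty."""
        return self._pending.popleft() if self._pending else None

    def __len__(self) -> int:
        return len(self._pending)


if __name__ == "__main__":
    queue = ApplicantQueue()
    queue.submit("A-001", "5551234567")  # 10-digit external number
    queue.submit("A-002", "4021")        # 4-digit internal extension
    print(queue.next_to_score())
```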

Upon selecting a specific applicant 1 from the queue, any information can be displayed. For example, a task description, corresponding recorded audio data, and corresponding confidence scoring 13 can be displayed for each of the exercises. For each task within an exercise, wave files (for example) are available for listening by the reviewing agent. Displayed with the wave files are associated confidence scores that range from 0.9999 to 0.0000, with the higher confidence score indicating that the pronunciation is more likely correct than not. In addition to automated scoring with confidence measures, the reviewing agent can subjectively score each wave file on a scale of 1 to 100, for example. Accents, speaking skills, and perceived personality are readily detected by a reviewing agent trained to screen such candidates based on how they sound. After listening to each wave file, a subjective score (between 1 and 100) is entered into the applicant's audio-web profile for future processing. In the example of FIG. 7, subjective scores are shown to be 100, 80, and 80, respectively, for each of three evaluated items.
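For illustration, one entry of an applicant's audio profile could be represented as a record that holds the task description, the recorded audio, the automated confidence score, and the reviewer's subjective score, with range checks matching the scales described above. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskResult:
    """One scored task within an exercise (hypothetical record layout)."""

    task_description: str
    audio_file: str                         # e.g., name of a recorded wave file
    confidence: float                       # automated score, 0.0000 to 0.9999
    subjective_score: Optional[int] = None  # reviewer score, 1 to 100

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 0.9999:
            raise ValueError("confidence must be between 0.0000 and 0.9999")

    def set_subjective_score(self, score: int) -> None:
        """Record the reviewing agent's score after listening to the audio."""
        if not 1 <= score <= 100:
            raise ValueError("subjective score must be between 1 and 100")
        self.subjective_score = score


if __name__ == "__main__":
    result = TaskResult("Repeat phrase 1", "phrase1.wav", 0.8731)
    result.set_subjective_score(80)
    print(result)
```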

For questions that require correct answers (as opposed to repeating phrases or reading text), a confidence score can serve multiple purposes, for example, a likelihood of correctness and a likelihood of correct pronunciation. For such questions, high scores are only possible when the correct answer is given and the pronunciation matches the recognizer's expected pronunciation rules, which are represented in a voice-recognition lexicon that can be optimized for specific desired pronunciations.
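The dual role of the confidence score for such questions can be illustrated with a simple gating rule. The description does not specify how correctness and pronunciation are combined, so the sketch below is an assumption: the function names normalize and question_score are hypothetical, and the recognizer's output text and confidence are taken as given inputs rather than produced by any particular speech-recognition product.

```python
def normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def question_score(recognized_text: str,
                   expected_answer: str,
                   recognizer_confidence: float) -> float:
    """Return a score that is high only for a correct, well-pronounced answer.

    Assumed rule: a wrong answer scores 0.0 regardless of pronunciation;
    a correct answer keeps the recognizer's confidence, which reflects how
    closely the pronunciation matched the lexicon.
    """
    if not 0.0 <= recognizer_confidence <= 1.0:
        raise ValueError(f"confidence out of range: {recognizer_confidence}")
    if normalize(recognized_text) != normalize(expected_answer):
        return 0.0
    return recognizer_confidence
```

Under this assumed rule, a correct answer spoken with poor pronunciation still receives a low score, because the recognizer's confidence itself is low.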

For most audio wave files that are recorded by the application, meaningful automated scoring is achieved by applying confidence scoring, which is important to the invention and is described, for example, in “Recognition Confidence Scoring for Use in Speech Understanding Systems” Hazen et al. 2000 (http://citeseer.ist.psu.edu/hazen00recognition.html), which is hereby incorporated herein by reference in its entirety. In fact, a completely automated screening process can be used to filter out a high percentage of applicants without human intervention. A standard of acceptance can be adjustable. Performance criteria can be completely objective. For example, just by looking at the queue of applicants, the aggregate confidence score (without human intervention) can be displayed and the applicants can be rank-ordered automatically before any human analysis of the applicant's audio data is performed. Perhaps only the top 25% of the applicants that complete the screening application will be considered for further evaluation by human intervention in one exemplary screening method.
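The automatic rank-ordering and the exemplary top-25% cutoff can be sketched as follows. The description does not state how per-task confidence scores are aggregated, so the arithmetic mean used here is an assumption, as are the function names and the choice to round the cutoff up to a whole applicant.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple


def aggregate_confidence(scores: List[float]) -> float:
    """Aggregate per-task confidence scores (assumed here: arithmetic mean)."""
    return mean(scores)


def rank_applicants(confidences: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank applicants by aggregate confidence, highest first.

    Applicants with no recorded scores are skipped.
    """
    aggregated = [
        (applicant_id, aggregate_confidence(scores))
        for applicant_id, scores in confidences.items()
        if scores
    ]
    return sorted(aggregated, key=lambda pair: pair[1], reverse=True)


def select_top_fraction(ranked: List[Tuple[str, float]],
                        fraction: float = 0.25) -> List[Tuple[str, float]]:
    """Keep the top fraction of a ranked list, rounding up to a whole applicant."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError(f"fraction must be in (0, 1]: {fraction}")
    keep = math.ceil(len(ranked) * fraction)
    return ranked[:keep]
```

The fraction parameter plays the role of the adjustable standard of acceptance: raising or lowering it changes how many applicants are passed on for evaluation by a human reviewer.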

There are several types of applicant-related tasks that can be automatically scored by applying confidence measures, including, for example:

1) repeating phrases through prompting;

2) speaking (or reading out loud) displayed text;

3) speaking answers to prompted questions; and

4) speaking answers related to image information.

It is noted that several types of cognitive processing are required to complete all of the applicant tasks successfully. These include, but are not limited to, reading, listening, speaking, knowledge of a language, analyzing images, understanding instructions, creativity, manual dexterity, and relevant knowledge to answer questions. It is through the combined application of these various skills that high-confidence scoring is achieved. With appropriate preparation, the process according to the present invention is capable of simulating an actual working environment. In fact, an applicant may be qualified to be a virtual agent (qualified to work remotely) by scoring high enough from his or her own calling environment, which must include appropriate telephone and web access.

Various servers 4, 9, 13 are mentioned herein. Mentioning them separately does not require that they be physically separate servers. Accordingly, a single physical server can host the functions described herein as servers 4, 9, 13.

The foregoing description and accompanying drawings illustrate the principles, preferred embodiments and modes of operation of the invention. However, the invention should not be construed as being limited to the particular embodiments discussed above. Additional variations of the embodiments discussed above will be appreciated by those skilled in the art.

Therefore, the above-described embodiments should be regarded as illustrative rather than restrictive. Accordingly, it should be appreciated that variations to those embodiments can be made by those skilled in the art without departing from the scope of the invention as defined by the following claims. 

1. A method for screening applicants by a screening entity, which comprises: remotely accessing a screening entity's multi-modal pre-screening tool by a candidate using an Internet-accessible device, the pre-screening tool comprising audio and visual elements operable to communicatively interact with the candidate; and carrying out an interactive skills assessment of the candidate by: communicating at least one cue to the candidate from the screening entity; requiring the candidate, upon receiving the at least one cue, to simultaneously interact with the pre-screening tool and, based upon this interaction, communicate a verbal response to the at least one cue; recording the candidate's verbal response with the pre-screening tool and storing the verbal response in a database profile associated with the candidate and accessible by the screening entity; and analyzing the candidate's recorded response by the screening entity and carrying out a quality and criteria judgment to determine the hiring potential of the candidate.
 2. The method according to claim 1, which further comprises carrying out the communicating steps by: communicating the at least one cue to the candidate using at least the audio element of the pre-screening tool; and communicating the candidate's verbal response to the screening entity using the audio element of the pre-screening tool.
 3. The method according to claim 1, which further comprises carrying out the candidate's interaction with the pre-screening tool using the visual element.
 4. The method according to claim 3, wherein the candidate's interaction using the visual element of the pre-screening tool comprises viewing at least one of textual and pictorial content displayed to the candidate using the visual element.
 5. The method according to claim 1, wherein the analyzing step further comprises automatically determining a confidence score of the candidate's verbal response with a computerized speech recognition tool of the pre-screening tool.
 6. The method according to claim 1, wherein the visual element of the pre-screening tool comprises an Internet web page.
 7. The method according to claim 1, wherein the audio element of the pre-screening tool comprises a Voice-Over-Internet Protocol component.
 8. The method according to claim 1, further comprising confirming the candidate's identity prior to commencing the interactive skills assessment of the candidate through a set of identification questions communicated to the candidate using the pre-screening tool.
 9. The method according to claim 1, which further comprises carrying out the storing step one of selectively and continually.
 10. The method according to claim 5, wherein the computerized speech recognition tool is programmed to parse a particular response into individual words and to either compare the parsed response to a desired response or to transcribe the parsed response for later use and access by the screening entity.
 11. The method according to claim 10, which further comprises carrying out the automated confidence scoring step by judging at least an accuracy and an intelligibility of the candidate's speech utilizing a target phrase represented in a speech recognition grammar.
 12. A method for screening applicants by a screening entity, which comprises: remotely accessing a screening entity's multi-modal pre-screening tool by a candidate using an Internet-accessible device, the pre-screening tool comprising audio and visual elements operable to communicatively interact with the candidate; and carrying out an interactive skills assessment of the candidate with the pre-screening tool by: communicating an instruction to the candidate from the screening entity; requiring the candidate, upon receiving the instruction, to perform a pre-defined speech task by simultaneously: viewing at least one of textual and pictorial content displayed to the candidate using the visual element of the pre-screening tool; and communicating, with the audio element of the pre-screening tool, a verbal response from the candidate to the screening entity based upon the displayed content; recording the candidate's verbal response with the pre-screening tool and storing the verbal response in a database profile associated with the candidate and accessible by the screening entity; and analyzing the candidate's recorded response by the screening entity and carrying out a quality and criteria judgment to determine the hiring potential of the candidate.
 13. The method according to claim 12, which further comprises carrying out the communicating step by communicating the instruction to the candidate using at least the audio element of the pre-screening tool.
 14. The method according to claim 12, wherein the analyzing step further comprises automatically determining a confidence score of the candidate's verbal response with a computerized speech recognition tool of the pre-screening tool.
 15. The method according to claim 12, wherein the visual element of the pre-screening tool comprises an Internet web page.
 16. The method according to claim 12, wherein the audio element of the pre-screening tool comprises a Voice-Over-Internet Protocol component.
 17. The method according to claim 12, wherein: the instruction communicating step comprises prompting the candidate to view a written script displayed by the visual element of the pre-screening tool; and the speech task step comprises requiring the candidate to read the written script aloud.
 18. The method according to claim 12, wherein the instruction communicating step comprises prompting the candidate to verbally repeat at least one spoken phrase.
 19. The method according to claim 12, wherein the instruction communicating step comprises prompting the candidate to verbally answer at least one multiple-choice question displayed to the candidate by the visual element of the pre-screening tool.
 20. The method according to claim 12, wherein the instruction communicating step comprises prompting the candidate to verbally answer at least one question regarding a graphic displayed to the candidate by the visual element of the pre-screening tool. 